refactor(core): rename benchmark → project for registry + sync (1/4) #1242
Merged
Internal-only rename (PR 1 of 4). The user-facing "benchmark" terminology in HTTP routes (/api/benchmarks/...), JSON field names (benchmark_id, benchmark_name), CLI flags, Studio components, and docs is unchanged in this PR; those land in PR 2 (HTTP API), PR 3 (Studio frontend), and PR 4 (docs).

Renamed:
- packages/core/src/benchmarks.ts → projects.ts
- packages/core/src/benchmark-sync.ts → project-sync.ts
- BenchmarkEntry → ProjectEntry, BenchmarkSource → ProjectSource, BenchmarkRegistry → ProjectRegistry
- loadBenchmarkRegistry → loadProjectRegistry, saveBenchmarkRegistry → saveProjectRegistry, addBenchmark → addProject, removeBenchmark → removeProject, getBenchmark → getProject, touchBenchmark → touchProject, discoverBenchmarks → discoverProjects, deriveBenchmarkId → deriveProjectId, getBenchmarksRegistryPath → getProjectsRegistryPath, syncBenchmark → syncProject, syncBenchmarks → syncProjects
- ~/.agentv/benchmarks.yaml → projects.yaml, top-level key `benchmarks:` → `projects:`

One-time migration:
- loadProjectRegistry() calls migrateLegacyBenchmarksFile() before reading the registry. If only benchmarks.yaml exists, it is read, transformed (top-level key rewritten), written to a temp file, atomically renamed to projects.yaml, and the legacy file is unlinked. If both files exist, projects.yaml wins and a warning is logged. Idempotent: subsequent loads are a no-op.

Rationale: 5 of 6 LLM observability tools (Phoenix, Langfuse, Braintrust, W&B Weave, LangSmith) use "project" for the container that holds eval runs, traces, datasets, and other telemetry. agentv is adding trace/span/latency capture alongside eval runs, making "benchmark" too narrow. The rename also disambiguates from the academic "benchmark = eval suite" usage that survives in example directory names (benchmark-tooling, multi-model-benchmark, etc.).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Deploying agentv with Cloudflare Pages

| | |
|---|---|
| Latest commit: | 5d84e67 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://68177b2b.agentv.pages.dev |
| Branch Preview URL: | https://refactor-rename-benchmark-to.agentv.pages.dev |
This was referenced May 14, 2026
Summary
PR 1 of 4 in the benchmark → project rename. Scope: internal `@agentv/core` symbols only. Wire formats (HTTP routes, JSON field keys, CLI flag descriptions, Studio routes/components, docs) are unchanged in this PR and will land in:
- PR 2 (HTTP API)
- PR 3 (Studio frontend)
- PR 4 (docs)

Renames
One-time legacy file migration
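`loadProjectRegistry()` calls `migrateLegacyBenchmarksFile()` before reading the registry. Four state transitions handled:
- Only `benchmarks.yaml` exists: it is read, the top-level key is rewritten, the result is written to a temp file and atomically renamed to `projects.yaml`, and the legacy file is unlinked.
- Only `projects.yaml` exists (already migrated): no-op.
- Both files exist: `projects.yaml` wins and a warning is logged.
- Neither file exists (fresh install): no-op.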
The temp+rename pattern keeps `projects.yaml` from ever being half-written; the legacy file is only removed after the rename succeeds.
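For reviewers who want to see the shape of the temp+rename flow, here is a minimal sketch. It is not the code in this PR: the `dir` parameter, the `MigrationResult` type, the temp file name, the shortened warning text, and the regex-based key rewrite are all assumptions made for illustration (the real function presumably resolves `~/.agentv` itself and may parse the YAML properly).

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical return type, used here only to make the outcomes visible.
type MigrationResult = "migrated" | "conflict" | "noop";

// Illustrative sketch, not the actual @agentv/core implementation.
// `dir` stands in for ~/.agentv so the function can run on a scratch directory.
function migrateLegacyBenchmarksFile(dir: string): MigrationResult {
  const legacy = path.join(dir, "benchmarks.yaml");
  const current = path.join(dir, "projects.yaml");

  // Fresh install or already migrated: nothing to do.
  if (!fs.existsSync(legacy)) return "noop";

  // Both files present: projects.yaml wins, the legacy file is left alone.
  if (fs.existsSync(current)) {
    console.warn(`[agentv] Both ${legacy} and ${current} exist. Using projects.yaml.`);
    return "conflict";
  }

  // Legacy only: rewrite the top-level key.
  // Assumption: a line-anchored regex is enough for this illustration.
  const text = fs.readFileSync(legacy, "utf8");
  const rewritten = text.replace(/^benchmarks:/m, "projects:");

  // Write the full content to a temp file in the same directory, then rename
  // it into place, so projects.yaml is never observed half-written.
  const tmp = path.join(dir, `projects.yaml.${process.pid}.tmp`);
  fs.writeFileSync(tmp, rewritten, "utf8");
  fs.renameSync(tmp, current);

  // Remove the legacy file only after the rename has succeeded.
  fs.unlinkSync(legacy);
  return "migrated";
}

// Usage on a scratch directory: the first call migrates, the second is a no-op.
const scratch = fs.mkdtempSync(path.join(os.tmpdir(), "agentv-migrate-"));
fs.writeFileSync(
  path.join(scratch, "benchmarks.yaml"),
  "benchmarks:\n  - id: alpha\n  - id: beta\n",
);
const first = migrateLegacyBenchmarksFile(scratch);
const second = migrateLegacyBenchmarksFile(scratch);
console.log(first, second); // prints: migrated noop
```

Keeping the temp file in the same directory as the target matters in this sketch: a rename is only atomic when source and destination are on the same filesystem.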
Why "project"
5 of 6 LLM observability tools (Phoenix, Langfuse, Braintrust, W&B Weave, LangSmith) use `project` for the container that holds eval runs, traces, and datasets. agentv is adding trace/span/latency capture alongside eval runs, making "benchmark" too narrow. The rename also disambiguates from the academic "benchmark = eval suite" usage that's retained in example directory names (`benchmark-tooling`, `multi-model-benchmark`, etc.) — those genuinely are benchmark suites and stay named that way.
Test plan
```
$ ls /tmp/uat-rename-home/.agentv/
benchmarks.yaml # legacy fixture with 2 entries (alpha + beta with source)
$ HOME=/tmp/uat-rename-home bun -e "import { loadProjectRegistry } from '@agentv/core';
const r = loadProjectRegistry(); console.log(r.projects.map(p => p.id));"
[agentv] Migrated registry: benchmarks.yaml → projects.yaml (2 entries)
[ "alpha", "beta" ]
$ ls /tmp/uat-rename-home/.agentv/
projects.yaml # legacy file gone, content preserved including .source
# Second load → silent no-op
$ HOME=/tmp/uat-rename-home bun -e "...loadProjectRegistry()..."
[ "alpha", "beta" ] # no migration log line — idempotent
# Both-files conflict → projects.yaml wins, warning emitted
$ # (re-create benchmarks.yaml alongside the new projects.yaml)
$ HOME=/tmp/uat-rename-home bun -e "...loadProjectRegistry()..."
[agentv] Both .../.agentv/benchmarks.yaml and .../.agentv/projects.yaml exist.
Using projects.yaml; delete benchmarks.yaml when you've confirmed the new file is correct.
[ "alpha", "beta" ]
```
The 4 new migration tests in `packages/core/test/projects.test.ts` cover the same three transitions plus the fresh-install no-op.
Notes on what's intentionally NOT renamed in this PR
🤖 Generated with Claude Code